Network proximity analysis as a theoretical model for identifying potential novel therapies in primary sclerosing cholangitis

Primary Sclerosing Cholangitis (PSC) is a progressive cholestatic liver disease with no licensed therapies. Previous Genome Wide Association Studies (GWAS) have identified genes that correlate significantly with PSC, and these were identified by systematic review. Here we use novel Network Proximity Analysis (NPA) methods to identify already licensed candidate drugs that may have an effect on the genetically coded aspects of PSC pathophysiology. Over 2000 agents were identified as significantly linked to genes implicated in PSC by this method. The most significant results include previously researched agents such as metronidazole, as well as biological agents such as basiliximab, abatacept and belatacept. This in silico analysis could potentially serve as a basis for developing novel clinical trials in this rare disease. Supplementary Information The online version contains supplementary material available at 10.1186/s12920-024-01927-2.


Introduction
Primary Sclerosing Cholangitis (PSC) is a rare, progressive cholestatic autoimmune liver disease that leads to chronic liver injury, biliary cirrhosis (as a consequence of the development of significant biliary stricturing) with its associated complications of bacterial cholangitis, and cholangiocarcinoma [1]. PSC patients also experience cholestatic disease symptoms, [...] of hepatic decompensation, death and need for liver transplantation [2][3][4][5][6][7][8][9][10][11][12][13][14][15][16]. Biochemical improvement has also been demonstrated with metronidazole [17], vancomycin [17][18][19], bezafibrate [20], norursodeoxycholic acid [21] and obeticholic acid [22] but, again, no proven survival benefit has been demonstrated. Until recently, guidelines did not recommend the use of any of these agents, and the identification of a specific therapy for PSC is seen as an area of the highest priority [23][24][25]. The 2022 European Association for the Study of the Liver (EASL) guidelines now recommend that ursodeoxycholic acid (UDCA) can be used but acknowledge that the evidence for this recommendation is limited [25].
In this study, we use an in silico approach to identify potential novel therapy options for PSC, utilising the extensive, previously published findings regarding the genetic basis of the disease. Network Proximity Analysis (NPA) is a virtual method of exploring potential relationships between known drug targets and genes known to be associated with disease [26]. The 'druggable genome' uses genome-wide association study (GWAS) data alongside established drug mechanisms to catalogue possible sites of interaction. The output from this analysis approach is a list of drugs that have a genetic target known to be proximal to a disease-implicated gene, and that may therefore have an effect on genetically encoded mechanisms of disease pathogenesis. This approach, already utilised in PBC [27], allows identification of treatments that have not previously been linked to PSC and that could be repurposed from other indications. It offers particular potential for identifying a number of candidate agents that could be systematically evaluated in an 'adaptive' trial model, ideally suited to rare diseases where potential trial populations are by definition limited.

Identification of candidate genes
A systematic literature review was conducted in December 2020, initially searching PubMed for papers tagged with "Primary Sclerosing Cholangitis" and "GWAS", which identified 17 full publications. On review of these publications, and additional cited publications, 22 papers were identified. These comprised 11 GWAS studies in PSC and 11 review articles or GWAS in other disease areas. This search was repeated using MEDLINE, which identified no additional papers. Clinical trials in PSC were identified from ClinicalTrials.gov [28] and from the BSG [23] and AASLD [24] guidelines to cross-reference previously investigated agents, and this list was supplemented with trials reported elsewhere in the literature. This review was conducted by a single investigator (see Fig. 1 for PRISMA diagram).
Collation of results yielded 89 unique single nucleotide polymorphisms (SNPs) associated with PSC. Human leukocyte antigen (HLA)-associated SNPs, those that did not achieve genome-wide significance (p < 5 × 10⁻⁸) and those that did not suggest a relevant gene were excluded, leaving 26 unique genetic loci for analysis (Table 1, including duplicate association statistics) as reported in 8 studies (Supplementary Table 1).
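The three exclusion criteria above can be illustrated with a small filter. This is a minimal sketch only, not the procedure used in the study: the record fields (`rsid`, `gene`, `p`, `hla`) and all example records are hypothetical placeholders, and keeping one entry per gene is an assumption about how unique loci might be counted.

```python
GENOME_WIDE_P = 5e-8  # genome-wide significance threshold quoted in the text


def select_loci(snps):
    """Apply the three exclusion criteria, then keep the most significant SNP per gene.

    Each SNP is a dict with hypothetical keys: rsid, gene, p, hla.
    """
    best = {}
    for snp in snps:
        if snp["hla"]:                 # exclude HLA-associated SNPs
            continue
        if snp["p"] >= GENOME_WIDE_P:  # exclude SNPs below genome-wide significance
            continue
        if not snp["gene"]:            # exclude SNPs that suggest no relevant gene
            continue
        current = best.get(snp["gene"])
        if current is None or snp["p"] < current["p"]:
            best[snp["gene"]] = snp
    return best


# Placeholder records for illustration; none of these are real study data.
EXAMPLE_SNPS = [
    {"rsid": "rs_example_1", "gene": "GENE_A", "p": 1e-12, "hla": False},
    {"rsid": "rs_example_2", "gene": "GENE_A", "p": 3e-9, "hla": False},
    {"rsid": "rs_example_3", "gene": "GENE_B", "p": 2e-6, "hla": False},
    {"rsid": "rs_example_4", "gene": "HLA_EXAMPLE", "p": 1e-30, "hla": True},
    {"rsid": "rs_example_5", "gene": None, "p": 1e-10, "hla": False},
]

loci = select_loci(EXAMPLE_SNPS)
print(sorted(loci))
```

In this toy input only `GENE_A` survives: the other records fail the significance, HLA and gene-mapping criteria respectively.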

Network proximity analysis
This study used the Python code [29] and drug-disease network validated by Guney et al. [30] to seek drug targets for the candidate genes. The method uses a previously published interactome network [31] and has demonstrated that drug target-disease proximity is a good marker of efficacy. As illustrated in Fig. 2, for each drug, the method calculates d_c (the average, over the drug's target genes, of the distance from each target to its closest disease-associated gene), and this is used to calculate a z-score, z = (d_c - µ)/σ, using a randomisation procedure to calculate µ and σ empirically. The end result is a drug-disease proximity z-score for each of the drugs in the DrugBank [32] resource (a freely available drug database containing known genetic drug targets; February 2021 version). Guney et al. [30] validated a z-score cut-off of ≤ -0.15 to infer that a drug is proximal to the disease and may exert a pharmacological effect, based on known drug-disease effects. In order to identify the compounds most strongly associated with PSC-implicated pathways, and therefore those that may be most clinically relevant, we chose to use a more stringent z-score cut-off of ≤ -2.0.
The methodology used here was purely a secondary analysis of published data from previous studies, and did not involve any direct patient information. As such, no ethical approval was required. Previously published data about NPA in PBC were utilised in this study as a comparator for PSC [27], rather than collecting new PBC data.
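The calculation of d_c and the z-score described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the published code [29]: the function names and the toy chain network are hypothetical, and the random sets are sampled uniformly with no degree matching, so it should not be read as the randomisation procedure of Guney et al. [30].

```python
import random
from collections import deque
from statistics import mean, pstdev


def hop_distances(graph, source):
    """Breadth-first search: number of edges from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def closest_distance(graph, targets, disease_genes):
    """d_c: mean over drug targets of the distance to the nearest disease gene."""
    per_target = []
    for target in targets:
        dist = hop_distances(graph, target)
        per_target.append(min(dist[gene] for gene in disease_genes))
    return mean(per_target)


def proximity_z(graph, targets, disease_genes, n_random=1000, seed=0):
    """Return (d_c, z), with mu and sigma estimated from uniformly random node sets.

    Assumes a connected graph. Uniform sampling is a simplification chosen
    for this sketch only.
    """
    rng = random.Random(seed)
    nodes = sorted(graph)
    d_c = closest_distance(graph, targets, disease_genes)
    null = []
    for _ in range(n_random):
        random_targets = rng.sample(nodes, len(targets))
        random_disease = rng.sample(nodes, len(disease_genes))
        null.append(closest_distance(graph, random_targets, random_disease))
    mu, sigma = mean(null), pstdev(null)
    return d_c, (d_c - mu) / sigma


# Toy "interactome": a chain of 20 hypothetical nodes, 0 - 1 - 2 - ... - 19.
TOY_GRAPH = {i: [j for j in (i - 1, i + 1) if 0 <= j < 20] for i in range(20)}

# Two drug targets, each one step from a disease gene, so d_c = 1.
d_c, z = proximity_z(TOY_GRAPH, targets=[0, 4], disease_genes=[1, 5])
print(f"d_c = {d_c}, z = {z:.2f}")
```

A negative z indicates that the drug targets sit closer to the disease genes than randomly chosen node sets of the same size.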

Results
Network proximity analysis of 6296 compounds identified 2528 compounds with z scores ≤ -0.15 and 101 with z scores ≤ -2.0 for PSC (Supplementary Table 3), many of which are not medicinal products. Given that the focus of this study was to identify plausible candidate therapies, non-medicinal compounds were not considered further. A total of 42 medicinal products potentially appropriate for systemic therapeutic use showed a z score of ≤ -2.0 (Table 2). Of those, 23 are already licensed for another indication and therefore may be candidates for repurposing in PSC (denoted by * in the table). Only one identified compound (metronidazole) has, to our knowledge, been suggested as a potential therapy for PSC.
The agents already in clinical use for other indications with the lowest z scores, indicating very close proximity to a disease-associated gene, are all immune modulators: Denileukin diftitox (-5.087), Basiliximab (-5.038), Abatacept (-3.787) and Belatacept (-3.73). Isosorbide, used in angina, was the only non-immunomodulatory agent with a highly proximal z-score (-3.116).
Table 3 lists the proximities of drugs currently or previously trialled in PSC (as recorded on ClinicalTrials.gov [28] or with published data) to evaluate whether they would have been identified as plausible candidates using NPA methodology, i.e. as likely to have an effect on the genetically encoded pathogenesis of PSC. There were 11 compounds with a z score ≤ -0.15 but only metronidazole had a z score ≤ -2.0.
Given the strong relationship between PSC and inflammatory bowel disease (IBD), we explored the proximity in NPA for PSC of agents that are already established in IBD therapeutics (Table 4).Corticosteroids were significantly proximal with budesonide having a z-score of -0.822 and prednisolone (in its various forms) having z-scores of ≤ -0.15.However, the other currently available treatments were not proximal, all having positive z scores.
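The two cut-offs used throughout the Results can be applied as a simple three-way classification. The sketch below is illustrative only: the tier labels are our own shorthand, not terminology from Guney et al. [30], and the z-scores are the PSC values quoted in this article (the UDCA value is reported in the Discussion).

```python
PROXIMAL_CUTOFF = -0.15   # validated cut-off from Guney et al. [30]
STRINGENT_CUTOFF = -2.0   # more stringent cut-off chosen for this study


def classify(z):
    """Map a proximity z-score to a tier (labels are illustrative shorthand)."""
    if z <= STRINGENT_CUTOFF:
        return "highly proximal"
    if z <= PROXIMAL_CUTOFF:
        return "proximal"
    return "not proximal"


# PSC z-scores as quoted in the text of this article.
PSC_Z_SCORES = {
    "denileukin diftitox": -5.087,
    "basiliximab": -5.038,
    "abatacept": -3.787,
    "belatacept": -3.73,
    "isosorbide": -3.116,
    "budesonide": -0.822,
    "UDCA": 1.042,
}

tiers = {drug: classify(z) for drug, z in PSC_Z_SCORES.items()}
for drug, z in sorted(PSC_Z_SCORES.items(), key=lambda item: item[1]):
    print(f"{drug:20s} {z:7.3f}  {tiers[drug]}")
```

Both comparisons are inclusive, matching the "≤" used for the thresholds in the text.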
Ozanimod, a sphingosine-1-phosphate receptor modulator, had a z score of -0.202 but remains in trial phase and is not currently licensed. It is important to note, however, that among the PSC studies in which the significant SNPs were identified, there was considerable heterogeneity in terms of comorbid IBD [33][34][35][36][37][38][39][40][41]. In time, with further characterisation of the PSC-IBD phenotype and genotype, these groups may need to be stratified for further genetic studies.
Therapies have previously been extrapolated from PBC to PSC without success in terms of demonstrating survival benefit. When these NPA methods were applied in PBC [27], published data showed 2637 compounds with z scores ≤ -0.15 and 253 with a z score ≤ -2.0. Of those with a z score ≤ -2.0, 109 were medicinal compounds. None of the therapies with confirmatory evidence of benefit from clinical trials in PBC had a z score of ≤ -2.0 (UDCA 0.171, obeticholic acid -0.737, bezafibrate -0.866, fenofibrate -0.986), and UDCA did not meet the minimal threshold of significant proximity of -0.15, so was not identified as a proximal compound.
Table 5 lists the 20 candidates achieving significant proximity with a z score of ≤ -2.0 in both PBC and PSC that are already in use for another indication or under investigation. Supplementary Table 5 provides a full list of all compounds with a z score of ≤ -2.0 in either or both diseases. We again observe the biological agents seen earlier (Basiliximab, Belatacept, Abatacept, Denileukin diftitox) and a number of compounds utilised or under investigation in other immune-mediated diseases (psoriasis, inflammatory bowel disease, rheumatoid arthritis). The analysis also identified non-biological agents that are proximal in both diseases (for example, the retinoids Arotinoid acid and Acitretin).

Discussion
In this "in silico" study we set out to use network proximity analysis to identify previously unheralded candidate therapies for potential clinical evaluation in PSC, based on their likelihood of action on genetically identified causative disease pathways. To our knowledge, this is the first time this approach has been used in PSC. The approach has been applied to a number of other chronic disease areas, including PBC [42], and has been proposed as a hypothesis-free methodology for identifying potentially valuable, novel approaches to therapy in disease areas with unmet clinical need.
While this method is known to demonstrate meaningful associations, it is important to note that there is not necessarily any implied directionality (drugs effective at these loci may worsen rather than improve the pathology), nor any guarantee that the GWAS-identified genes are truly implicated, rather than associated due to linkage disequilibrium or a non-coding transcription regulation region. However, given the availability of the GWAS data and validation of this method in other disease areas, the method is certainly an appropriate source of potential candidate drugs in PSC therapy. Also included in Table 2 are the gene target descriptions for the associated drugs, with those known to be inhibitors/antagonists highlighted as likely to down-regulate expression of the implicated genes (of note, this would need to be further investigated prior to any clinical trials, as reducing expression of a regulatory gene, for example, may exacerbate disease).
With no currently licensed therapy, PSC is a disease with obviously unmet clinical need. It is also a rare and heterogeneous disease, meaning that there are limited numbers of patients who can be recruited into clinical trials and the number of trials that can be conducted at any one time is restricted. This gives rise to "opportunity cost" in terms of less promising trials utilising the available patient pool and, as a result, preventing other trials of potentially more promising agents from being conducted. PSC is a condition in which novel potential therapeutic approaches are needed, in order to prioritise selection of agents for incorporation into trials. Network Proximity Analysis (NPA) is an approach that could potentially provide a solution to both of these challenges. By assessing the degree to which genetically encoded disease pathways showing a significant association with PSC co-map to predicted drug actions, NPA allows us to identify drugs that show a significant likely association with a disease-related pathway and which could therefore be novel candidate therapies. The converse is also true in that a drug with no apparent mechanistic effect on any PSC disease pathway might be less likely to be effective and thus a lower priority for trial evaluation. Using this approach, we identified a number of drugs with PSC
When degree of proximity was assessed for therapies already trialled in PSC there was a striking lack of association. The only agent showing a z-score ≤ -2.0 (a strong association, albeit markedly weaker than the drugs outlined above) was metronidazole. This drug has been shown to give modest improvement in liver biochemistry in a combination study with Ursodeoxycholic Acid [43]. The 2004 study of 80 patients randomised to either metronidazole or placebo in combination with UDCA showed improvement of liver biochemistry in both groups, with alkaline phosphatase (ALP) significantly more reduced in the metronidazole group (p < 0.05) after 36 months (although there was no significant impact on disease progression or long-term follow up). Weaker, but still significant, proximity was seen for Vancomycin (-0.967) and Bezafibrate (-0.434), drugs shown to have some benefit in terms of biochemical improvement in PSC [17][18][19][20]. In contrast, UDCA (widely used in PSC although with no confirmed evidence of survival benefit) and Obeticholic Acid (OCA; a licensed second-line therapy in PBC and previously trialled in PSC) were not significantly proximal, with z scores of 1.042 and 1.170, respectively.
Comparison of network proximity for PSC and PBC, diseases that have clinical features in common and in which overlapping therapy approaches have been explored (with varying degrees of success), shows interesting similarities and differences in the potential therapies identified. Medications previously trialled in PSC (including UDCA and OCA) that have been 'borrowed' from PBC for their effects on cholestasis appear to have no genome-level basis to their effect, potentially highlighting cholestasis as a common end-pathway to two different pathologies. Overall, PBC NPA identifies more candidate agents than PSC, including whole classes of drugs, such as the kinase inhibitors that are strong candidates in PBC. The approach does identify a number of unanticipated agents that are candidates in either PSC alone or in both PBC and PSC (exemplified by isosorbide [...] [21], cenicriviroc [44] and vidofludimus calcium [45]) so were not included in this study. This means the results of this analysis will be dynamic as both the drug-disease network and the DrugBank resource are updated and refined.
Where, if anywhere, does the analysis presented here leave us with regard to PSC therapy? The approach is a seductive one, generating an intriguing list of drugs that we could evaluate in what is currently an untreatable disease. It is one, however, with a number of important caveats. The first caveat is that although the approach has been taken in a number of disease areas (including PBC), the result has been the same each time: a list of interesting drugs but no progress beyond that. The next step is to incorporate these candidate drugs into a real-world clinical trial, and to formally test the hypothesis that the NPA approach in silico identifies drugs that have actual therapeutic effects in the target disease in clinical trials. Until this happens the approach remains an interesting side-line. The second caveat about the NPA approach is that whilst it shows a relationship between a molecule's modes of action and a disease pathway, it does not tell us the direction of that relationship clinically. It is conceivable that the approach identifies a drug that actually worsens rather than improves a disease. This needs to be remembered (and ideally explored theoretically prior to implementation) if and when we move from this analysis to a clinical trial. The third caveat is that by its nature the approach only addresses the genetic component of a disease. PSC, in common with most chronic inflammatory diseases, has both a genetic and an environmental component to its aetiology. However, as demonstrated in PBC (UDCA is the mainstay of treatment with proven benefit, but has a non-proximal z-score of 0.171), non-proximity at a genome level does not rule out drug efficacy, and NPA is not a method to retrospectively validate treatments. Any therapy that would work on an environmental component will not be flagged up using this analysis approach.
An important example might be a therapy modulating the microbiome in PSC.In this regard it is interesting that metronidazole and vancomycin are flagged up and yet might be expected to work on the environmental arm of aetiology.Their identification as candidates through NPA raises the interesting possibility that their mode of action might be unrelated to their anti-microbial actions.
The final caveat is that the approach may identify a candidate drug but it does not tell you how and when to use it. This is exemplified by Ustekinumab in PBC, a very strong candidate targeting a disease pathway strongly associated with PBC. The clinical trial of the drug in PBC was, however, negative [46]. One explanation for this apparent paradox would be that NPA in fact does not reliably identify candidate therapies. The alternative might be that the Ustekinumab trial targeted people who had failed UDCA therapy (i.e. people with "downstream", cholestasis-driven disease rather than disease in an "upstream", immune stage). Given the immune-modulatory nature of almost all of the drugs identified for PSC in this analysis, the lessons of the PBC Ustekinumab experience for future trial design in PSC are clear. While there are important caveats to remember, this method has identified drugs with known safety profiles that would be potential candidates for trials in PSC, a disease with otherwise no effective therapeutic options. This in silico exploration of therapeutics is a safe and novel way to identify candidate drugs to optimise efforts in rare disease trials and would form a helpful basis for further research into the use of metronidazole or, initially, of biologics such as basiliximab, abatacept or belatacept.

Fig. 1
PRISMA diagram for systematic literature review

Fig. 2
For each drug, the known target genes (nodes of the same colour) are linked to their nearest disease-associated genes (white nodes with black edging) to calculate the "distance" d_c between the drug and the disease. For Drug 1 (blue), the distance is the average of the four blue pathways (the distances from each of the drug target genes to the nearest disease-associated gene), i.e. d_c = 1.5. Drug 2 (green) has only two target genes but the same d_c = 1.5. Drug 3 (orange) has four target genes which are all quite distal and has d_c = 3.25. Drug 4 (purple) has two drug target genes closest to PSC3 and one closest to PSC2, with an overall d_c = 1. The final relative proximity measure z between each drug and the disease is calculated as z = (d_c - µ)/σ, where µ and σ are calculated empirically via a randomisation procedure.

Table 1
Summary of systematic literature review, identifying 26 unique genes for analysis

Table 4
PSC z scores for drugs used in inflammatory bowel disease

Table 5
Compounds with z score ≤ -2.0 for both PBC and PSC with current use or under investigation